Abstract

We study variational regularization methods in a general framework, 
more precisely those methods that use a discrepancy and a regularization 
functional. While several sets of sufficient conditions are known to obtain 
a regularization method, we start with an investigation of the converse 
question: What are necessary conditions for a variational method to pro- 
vide a regularization method? To this end, we formalize the notion of 
a variational scheme and compare three different instances of variational 
methods. Then we focus on the data space model and investigate the role 
and interplay of the topological structure, the convergence notion and the 
discrepancy functional. In particular, we deduce necessary conditions for the
discrepancy functional to fulfill the usual continuity assumptions.
1 Introduction

By "variational regularization" we mean every method that is used to approxi- 
mate an ill-posed problem by well-posed minimization problems. We start with 
a mapping $F : X \to Y$ between two sets X and Y and equations

Fx = y. 

A common problem with inverse problems is that of instability, i.e. arbitrarily
small disturbances in the right hand side y (e.g. replacing a "correct" y in the
range of F with one in an arbitrarily small neighborhood) may lead to unwanted
effects: a solution may no longer exist, or solutions for the perturbed
right hand side may differ arbitrarily from the true solutions. In topological
spaces X and Y we can formulate the problem of instability more precisely: The
equation $F x_{\mathrm{exact}} = y_{\mathrm{exact}}$ is unstable if there exists a
neighborhood U of $x_{\mathrm{exact}}$ such that for all neighborhoods V of
$y_{\mathrm{exact}}$ there exists $y^\delta \in V$ such that

\[ F^{-1}(y^\delta) \cap U = \emptyset \]

(cf. M).
Variational regularization methods replace the equation Fx = y by a
minimization problem for an (extended) real valued functional such that the mini-
mizers are suitable approximate solutions of the equation. The most widely used 
variational method is Tikhonov regularization [32] , but other methods are used 
as well. Starting from a detailed analysis of this method in Hilbert spaces, there 
are several recent studies on Tikhonov regularization in the context of more gen- 
eral spaces like Banach spaces [T51l2"5] or even topological spaces pnfT21HBl[?7] . 
These works provide a quite general set of sufficient assumptions under which 
Tikhonov regularization has the desired regularizing properties, i.e. stable solv- 
ability of the minimization problems and suitable approximation of the true 
solution if the noise vanishes. These sufficient assumptions are helpful to check 
if a chosen setting for variational regularization is indeed suited. On the other 
hand, when designing a regularization method it would be helpful to know in 
advance which setting works and which is not going to work. Hence, in this 
paper we begin with a study of the converse analysis and aim at providing nec- 
essary conditions on variational methods such that regularization is achieved. 
Such conditions will also be helpful in designing new variational methods as 
they rule out several options. Moreover, necessary conditions are a further step 
towards the understanding of the nature of variational regularization. 

We remark that we are aware that necessary conditions cannot be expected
to be very strong (as an example, a minimization problem can be changed quite
arbitrarily without changing the minimizer itself). However, there are already a
few results of this flavor known in the context of the regularization of ill-posed 
problems which we list here: 

Theorem 1.1 (No uniform bounded linear regularization, [10, Remark 3.5]).
If the linear and bounded operator $F : X \to Y$ between Hilbert spaces X and
Y does not have closed range and $(L_\alpha)_{\alpha > 0}$ is a family of linear and
bounded operators from Y to X such that for all $x \in X$ it holds that
$L_\alpha F x$ converges to x for $\alpha \to 0$, then $(\|L_\alpha\|)_{\alpha > 0}$ is
unbounded.

In other words, linear regularization methods are necessarily not uniformly 
bounded. 
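
As a numerical illustration of Theorem 1.1, the following sketch evaluates the operator norm of the Tikhonov family $L_\alpha = (F^*F + \alpha\,\mathrm{id})^{-1}F^*$ for a diagonal model operator. The operator (singular values 1/k, truncated to n entries) and the list of parameters are assumptions chosen for illustration and are not taken from the text. Any finite truncation has closed range, so the sketch only indicates the growth of the norms as $\alpha \to 0$; it does not reproduce the theorem itself.

```python
import numpy as np

def tikhonov_operator_norm(singular_values, alpha):
    """Spectral norm of L_alpha = (F^T F + alpha I)^{-1} F^T for a diagonal F."""
    s = np.asarray(singular_values, dtype=float)
    # For diagonal F the operator L_alpha is diagonal with entries s / (s^2 + alpha).
    return np.max(s / (s**2 + alpha))

# Hypothetical model operator: diagonal with singular values 1/k, k = 1..n.
n = 10_000
sigma = 1.0 / np.arange(1, n + 1)

alphas = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
norms = [tikhonov_operator_norm(sigma, a) for a in alphas]

for a, nrm in zip(alphas, norms):
    # 1 / (2 sqrt(alpha)) is the supremum of s / (s^2 + alpha) over all s > 0.
    print(f"alpha={a:.0e}  ||L_alpha||={nrm:.3f}  bound={0.5 / np.sqrt(a):.3f}")
```

The printed norms grow as the parameter decreases and stay below the value $1/(2\sqrt{\alpha})$.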

The next example of a necessary condition deals with the problem of parameter
choice. We need the Moore-Penrose pseudo-inverse $F^\dagger$ of a bounded linear
mapping between Hilbert spaces, cf. [1].

Theorem 1.2 (Bakushinskii Veto, [5]). Let $F : X \to Y$ be a bounded linear
operator between Hilbert spaces and $(L_\alpha)_{\alpha > 0}$ be a family of continuous
mappings from Y to X. If there is a mapping $\alpha : Y \to ]0, \infty[$ such that

\[
  \lim_{\delta \to 0} \sup \{ \|L_{\alpha(y^\delta)} y^\delta - F^\dagger y\| : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \} = 0,
\]

then $F^\dagger$ is bounded.

In other words, parameter choice rules which are valid in the worst-case
setting and which work for ill-posed problems (i.e. unbounded $F^\dagger$) necessarily
need to use the noise level.

A necessary condition for a-priori parameter choice rules was proven by Engl:

Theorem 1.3 (Decay conditions for a-priori parameter choice rules for linear
methods, [10, Prop. 3.7] and [5]). Let F and $(L_\alpha)$ be as in Theorem 1.1 and
$\alpha : ]0, \infty[ \to ]0, \infty[$ be an a-priori parameter choice rule. Then it holds
that

\[
  \lim_{\delta \to 0} \sup \{ \|L_{\alpha(\delta)} y^\delta - F^\dagger y\| : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \} = 0
\]

if and only if

\[
  \lim_{\delta \to 0} \alpha(\delta) = 0 \quad \text{and} \quad \lim_{\delta \to 0} \delta \, \|L_{\alpha(\delta)}\| = 0.
\]

In other words, a-priori parameter choice rules necessarily need to fulfill 
certain decay conditions. 
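
To make the decay conditions concrete, consider (as an illustration that is not part of the cited results) the Tikhonov family $L_\alpha = (F^*F + \alpha\,\mathrm{id})^{-1}F^*$ in Hilbert spaces. A standard spectral estimate bounds its norm by $1/(2\sqrt{\alpha})$, so the condition on $\delta\,\|L_{\alpha(\delta)}\|$ is implied by $\delta/\sqrt{\alpha(\delta)} \to 0$. The choice $\alpha(\delta) = \delta$ below is just one example of a rule meeting both conditions.

```latex
\[
  \|L_\alpha\| \le \sup_{\sigma > 0} \frac{\sigma}{\sigma^2 + \alpha} = \frac{1}{2\sqrt{\alpha}},
  \qquad
  \delta \, \|L_{\alpha(\delta)}\| \le \frac{\delta}{2\sqrt{\alpha(\delta)}}
  = \frac{\sqrt{\delta}}{2} \xrightarrow{\ \delta \to 0\ } 0
  \quad \text{for } \alpha(\delta) = \delta .
\]
```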

Finally we mention the "converse results" from [53] which say that for 
Tikhonov regularization in Hilbert spaces certain convergence rates imply that 
certain source conditions are fulfilled (see (T3] for a generalization to other
regularization methods).

Before we start our investigation of necessary conditions for variational
regularization in Section 3, we formalize the notion of a "variational scheme"
and investigate a few different variational methods.

2 Variational schemes: Tikhonov, Morozov, and 
Ivanov methods 

In this section we formalize the notion of a variational scheme which can be used 
to build variational regularization methods. Basically, a variational scheme con- 
sists of all ingredients which are needed to classify and analyze the associated 
minimization problems and their minimizers under perturbations of the data y. 
Hence, it should encode information about the involved spaces and their notions
of convergence and "proximity" , the forward operator, and the objective func- 
tional to be minimized. However, we do not allow for totally arbitrary objective 
functionals but we rather use the intuition that a variational scheme involves 
two functionals: a "similarity measure" or "discrepancy functional" p and a 
"regularization functional" R. The functional p is used to measure "similarity" 
in the data space in the sense that p(Fx, y) is small if x explains the data y 
well. The functional R on the solution space is used to measure how well x fits 
prior knowledge in the sense that R(x) is small for an x which fulfills the prior
knowledge well. 

Definition 2.1 (Variational scheme). By a variational scheme we understand
a tuple $\mathcal{M} = ((X, \tau_X), (Y, \tau_Y, S), p, R, F)$, consisting of

• topological spaces X and Y equipped with topologies $\tau_X$ and $\tau_Y$,
respectively; X is called solution space and Y is called data space,

• a sequential convergence structure S on Y.

That is, S is a mapping which maps any element of Y to a set of sequences
in Y such that the constant sequence (y) is an element of S(y) and that,
if a sequence is in S(y), then so is each of its subsequences. Usually, we
denote $(y_n)_n \in S(y)$ by $y_n \xrightarrow{S} y$ and say that $(y_n)$ converges
to y (with respect to S), see also [3, §1.7].

• $p : Y \times Y \to [0, \infty]$ is the discrepancy functional, for which we assume
that $p(y, y) = 0$ for all $y \in Y$,

• $R : X \to [0, \infty]$ is the regularization functional, and

• $F : X \to Y$ is a mapping or forward operator.

As in [T7J we use topological spaces since the functionals we consider do 
not take any linear structure into account which would justify the use of linear
or normed spaces. While the role of most other ingredients of a variational 
scheme is obvious, we remark on the sequential convergence structure S: Of- 
ten decaying noise is described in terms of norm-convergence, a notion which 
is not available or not even appropriate (see, e.g. [H] and [37]). Therefore, the 
sequential convergence structure will be used to describe "vanishing noise" in 
Y , i.e. the vanishing of noise is modelled by convergence of a sequence (y n ) to 
noise free data y w.r.t. S. Note that we do not assume that convergence w.r.t. S 
is topological since this is not used in standard proofs for regularizing
properties (e.g. [IH]). Moreover, the topology $\tau_Y$ may induce a different convergence
structure which is more tied to the mapping properties of F. In the case of a 
Banach space Y one can think of the following situation which is for example 
used in [19] : The sequential convergence structure is given by the convergence 
with respect to the norm in Y and $\tau_Y$ is the weak topology on Y. Of course,
there will be further relations between $\tau_Y$, S and p in the following, and
indeed, Section 3 mainly deals with these relations, but for the general variational
scheme we keep them mostly unrelated.

We mention that we included the value $\infty$ in the range of the discrepancy
functional p and the regularization functional R to model that certain data
may be considered "incomparable" or that certain solutions may be impossible.
As usual, the value $\infty$ is excluded for minimizers by definition and we use the
notation $\operatorname{dom} R = \{x : R(x) < \infty\}$ (similarly for p).

Variational regularization methods can be built from variational schemes as
follows. Instead of solving Fx = y we aim at two goals: Find an $x \in X$ such
that

1. x explains the data y well, in the sense that p(Fx,y) is small, and 

2. x fits to our prior knowledge in the sense that R(x) is small. 

In other words: We have two objective functionals $x \mapsto p(Fx, y)$ and
$x \mapsto R(x)$ which we would like to "jointly minimize". Of course, in general
this is not possible; however, such problems go under the name of "multicriteria",
"multiobjective" or "vector optimization". A core notion there is that of
"Pareto-optimal solutions", i.e., solutions $x^*$ such that there does not exist an
x such that $R(x) \le R(x^*)$ and $p(Fx,y) \le p(Fx^*,y)$ and one of the two
inequalities is strict [6, §4.7]. Note that for "exact data", i.e.
$y_{\mathrm{exact}}$ in the range of F, the notion of Pareto optimality induces a
notion of generalized solutions of the equation Fx = y (see [12] for a slightly
different notion):

Definition 2.2. Let $((X, \tau_X), (Y, \tau_Y, S), p, R, F)$ be a variational scheme and
$y_{\mathrm{exact}}$ in the range of F. We say that x is a p-generalized R-minimal
solution of $Fx = y_{\mathrm{exact}}$ if $p(Fx, y_{\mathrm{exact}}) = 0$ and
$R(x) = \min\{R(\tilde{x}) : \tilde{x} \in X,\ p(F\tilde{x}, y_{\mathrm{exact}}) = 0\}$.

Using the two objective functionals $p(F\,\cdot\,, y)$ and R we can build at least three
different minimization problems which aim at finding Pareto optimal solutions. 
These three problems are well known in the inverse problems community and 
in fact can be traced back to the pioneering works in the Russian school: 






• Tikhonov regularization [35]: For $\alpha > 0$ set
$T_{\alpha,y}(x) := p(Fx,y) + \alpha R(x)$ and consider

\[ T_{\alpha,y}(x) \to \min_{x \in X}. \tag{$T_\alpha$} \]

In other words: Choose a weighting between "good data fit" and "good 
fit to prior knowledge" and minimize the weighted objective functional. 
In the context of multicriteria optimization this is known as scalarization. 

• Ivanov regularization [20]: For $\tau > 0$ consider

\[ p(Fx,y) \to \min_{x \in X} \quad \text{s.t.} \quad R(x) \le \tau. \tag{$I_\tau$} \]

In other words: Choose the solution with the best data-fit which also fits 
the prior knowledge up to a predefined amount. 

• Morozov regularization [55]: For $\delta > 0$ consider

\[ R(x) \to \min_{x \in X} \quad \text{s.t.} \quad p(Fx,y) \le \delta. \tag{$M_\delta$} \]

In other words: Choose the solution which fits best the prior knowledge 
among the ones which explain the data up to a predefined amount. 

These methods are treated and compared e.g. in [21, Ch. 3.5] (where ($I_\tau$) is
called "method of quasi-solutions" and ($M_\delta$) goes under the name "method of
the residual") in the case of Banach spaces and $p(Fx,y) = \|Fx - y\|^p$ and
$R(x) = \|Lx\|^q$ with a (possibly unbounded) linear operator L. Here we present results
on the relation of the minimizers of these methods in our abstract framework 
of a variational scheme. Note that we do not pose any convexity assumptions 
on R or p. 

Theorem 2.3. Let $\mathcal{M}$ be a variational scheme according to Definition 2.1.

1. If there exists a unique solution $x_\tau$ of ($I_\tau$) which fulfills
$R(x_\tau) = \tau$, then it solves ($M_\delta$) with $\delta = p(Fx_\tau, y)$.

2. If there exists a unique solution $x_\delta$ of ($M_\delta$) which fulfills
$p(Fx_\delta, y) = \delta$, then it solves ($I_\tau$) with $\tau = R(x_\delta)$.

3. If there exists a unique solution $x_\alpha$ of ($T_\alpha$), then it solves
($I_\tau$) with $\tau = R(x_\alpha)$ and ($M_\delta$) with $\delta = p(Fx_\alpha, y)$.

Proof. 1. With $\delta = p(Fx_\tau, y)$, it is clear that $x_\tau$ is feasible for the
optimization problem ($M_\delta$) and the objective value is $R(x_\tau) = \tau$. Assume
that there is a solution $x \neq x_\tau$ of ($M_\delta$) with $R(x) \le \tau$. Then x would
be feasible for ($I_\tau$) with objective $p(Fx,y) \le \delta = p(Fx_\tau, y)$, which is a
contradiction to the uniqueness of the solution $x_\tau$.

2. The proof mimics the proof of the first statement.

3. Again, assume that there exists a solution $x \neq x_\alpha$ of ($I_\tau$). Then one
sees that $T_{\alpha,y}(x) \le T_{\alpha,y}(x_\alpha)$, contradicting the uniqueness of
$x_\alpha$. The proof is similar for the last claim.

□ 
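
The third statement of Theorem 2.3 can be checked numerically in a toy setting. The sketch below assumes X = Y = R, F = id, $p(Fx,y) = (x-y)^2$ and $R(x) = |x|$, and solves all three problems by brute force on a grid; the setting, the grid and the function names are illustrative assumptions, not part of the theorem.

```python
import numpy as np

# Toy setting (an assumption for illustration): X = Y = R, F = id,
# p(Fx, y) = (x - y)^2 and R(x) = |x|, minimized by brute force on a grid.
grid = np.linspace(-2.0, 2.0, 4001)
TOL = 1e-9  # slack for the constraints, guards against rounding on the grid

def discrepancy(x, y):
    return (x - y) ** 2

def regularizer(x):
    return np.abs(x)

def tikhonov(y, alpha):
    """Minimize p(Fx, y) + alpha * R(x) over the grid."""
    return grid[np.argmin(discrepancy(grid, y) + alpha * regularizer(grid))]

def ivanov(y, tau):
    """Minimize p(Fx, y) subject to R(x) <= tau."""
    feasible = grid[regularizer(grid) <= tau + TOL]
    return feasible[np.argmin(discrepancy(feasible, y))]

def morozov(y, delta):
    """Minimize R(x) subject to p(Fx, y) <= delta."""
    feasible = grid[discrepancy(grid, y) <= delta + TOL]
    return feasible[np.argmin(regularizer(feasible))]

y, alpha = 1.0, 1.0
x_alpha = tikhonov(y, alpha)
# Feed the Tikhonov minimizer's values of R and p into the constrained problems.
x_tau = ivanov(y, tau=regularizer(x_alpha))
x_delta = morozov(y, delta=discrepancy(x_alpha, y))
print(x_alpha, x_tau, x_delta)
```

In this strictly convex example the Tikhonov minimizer is unique, and the Ivanov and Morozov problems with the induced parameters return the same grid point.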
We remark that the missing implications in Theorem 2.3 are not true without
additional assumptions. 

Example 2.4 (Unique Ivanov and Morozov minimizers need not be Tikhonov
minimizers). We illustrate this by a simple one-dimensional example: Let
$X = Y = \mathbb{R}$, F = id and consider the regularization functional $R(x) = |x + 1|$
(saying that the solution should be close to −1) and as discrepancy functional
the so-called Bregman distance with respect to the strictly convex function
$x \mapsto x^4$, i.e. $p(x,y) = y^4 - x^4 - 4x^3(y - x)$. We choose $\tau = 1$ and y = 1 and
obtain $x_\tau = 0$ as the unique solution of ($I_\tau$) (which is also the unique solution
of ($M_\delta$) with $\delta = 1$). But there is no $\alpha > 0$ such that x = 0 is a minimizer of
$T_{\alpha,1}(x) = p(x, 1) + \alpha|x + 1|$ (cf. Figure 1).
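
The claims of Example 2.4 can be checked numerically. The following sketch solves the Ivanov problem on a grid and compares the Tikhonov functional at x = 0 with its minimum over the grid for a sample of parameters; the grid and the sampled values of the parameter are assumptions for illustration, so this is a plausibility check rather than a proof for all $\alpha > 0$.

```python
import numpy as np

def p(x, y):
    """Bregman distance of t -> t^4, as in Example 2.4."""
    return y**4 - x**4 - 4 * x**3 * (y - x)

def R(x):
    return np.abs(x + 1.0)

y = 1.0
grid = np.linspace(-3.0, 3.0, 6001)

# Ivanov problem with tau = 1: minimize p(x, 1) subject to |x + 1| <= 1.
feasible = grid[R(grid) <= 1.0 + 1e-12]
x_tau = feasible[np.argmin(p(feasible, y))]

# Tikhonov functional: compare its value at 0 with its minimum over the grid.
alphas = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
gaps = []
for alpha in alphas:
    T = p(grid, y) + alpha * R(grid)
    # A positive gap means that x = 0 is not a minimizer for this alpha.
    gaps.append(p(0.0, y) + alpha * R(0.0) - T.min())
print(x_tau, gaps)
```

The reason behind the positive gaps is that p(·, 1) has vanishing derivative at 0 while $\alpha|x + 1|$ has slope $\alpha > 0$ there, so the Tikhonov functional strictly decreases when moving slightly to the left of 0.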

In the above example it holds that $R'(x_\tau) \neq 0$ and that $x_\tau$ is a
stationary point of the mapping $x \mapsto p(Fx,y)$. Moreover, note that the precise
form of R is not important in this example; several other R with $R'(0) > 0$ would
also work. Indeed, we can deduce from the next proposition that it is necessary
for $x_\tau$ to be also a (local) Tikhonov minimizer that not both of these
properties are fulfilled.

Proposition 2.5. Let $\mathcal{M}$ be a variational scheme and let X be a Banach space
and let $y \in Y$. Furthermore, assume that the mappings $f(x) = p(Fx,y)$ and R
are Gateaux differentiable at $x^* \in X$.

If $x^*$ is a local minimizer of $T_{\alpha,y}$ for some $\alpha > 0$ then for every v it
holds that

\[ -\alpha R'(x^*)v \le f'(x^*)v. \]

Moreover, if $R'(x^*) \neq 0$, then $f'(x^*) \neq 0$.

Proof. Since $x^*$ is a local minimizer of $T_{\alpha,y}$, it holds, for
$\varepsilon > 0$ small enough,

\[ \alpha \bigl( R(x^*) - R(x^* + \varepsilon v) \bigr) \le f(x^* + \varepsilon v) - f(x^*). \]

Dividing by $\varepsilon$ and passing to the limit $\varepsilon \to 0$ proves the first
assertion. If $R'(x^*) \neq 0$ then there is v such that $R'(x^*)v < 0$ and it follows
that $f'(x^*)v > 0$; hence, $f'(x^*) \neq 0$. □

In other words: If we have a solution $x^*$ of ($I_\tau$) with $R'(x^*) \neq 0$ which is
also a local minimizer of $T_{\alpha,y}$, then it is not stationary for
$x \mapsto p(Fx,y)$.

Note that a result similar to Proposition 2.5 can be found in [57, Thm. 4.13].






Remark 2.6. Under convexity assumptions on $f(x) = p(Fx,y)$ and R one can
show that Ivanov minimizers (or Morozov minimizers) are indeed also Tikhonov
minimizers for some parameter $\alpha > 0$ if they are not minimizers of the
constraint. This is related to the fact that the subgradients of convex functions
describe the normal vectors to the sublevel sets of the respective function, see 

e.g. [sg. 

Although the variational problems ($T_\alpha$), ($I_\tau$), and ($M_\delta$) share their solutions
under the circumstances presented above, they often differ with respect to their 
practical application. 

It has been remarked already in early works (see, e.g., [3T]) that Ivanov 
and Morozov regularization are related to different types of prior knowledge 
on the exact equation $F x_{\mathrm{exact}} = y_{\mathrm{exact}}$. Morozov regularization is
related to prior knowledge about the exact data or, more precisely, the noise
level of the available data y, i.e., upper estimates on the quantity
$p(y_{\mathrm{exact}}, y)$. Ivanov regularization, however, is related to prior
knowledge about the exact solution, more precisely, to upper estimates on the
quantity $R(x_{\mathrm{exact}})$.

Hence, the choice between Morozov and Ivanov regularization should be 
based upon the available prior knowledge at hand. 

However, there are more factors, which should be taken into account when 
choosing the variational method, namely the factors of tractability and
computational complexity. The three optimization problems ($T_\alpha$), ($I_\tau$), and ($M_\delta$) may
belong to different "subclasses" of optimization problems and their solution may 
have different computational complexity. 

Example 2.7 (Linear problems in Hilbert space). In this classical setting, X and
Y are Hilbert spaces, F is bounded and linear and we use
$p(Fx, y) = \|Fx - y\|_Y^2$ and $R(x) = \|x\|_X^2$. In this case, the Tikhonov problem
has the explicit solution $x_\alpha = (F^*F + \alpha\,\mathrm{id})^{-1} F^* y$, which can be
treated numerically in several convenient ways (since the operator which has to
be inverted is self-adjoint and positive definite).

However, for both Ivanov and Morozov regularization no closed-form solution
exists in general and one usually resorts to solving a series of Tikhonov problems,
adjusting the parameter a such that the Ivanov or Morozov constraint is ful- 
filled nsj. 
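
A minimal finite-dimensional sketch of this strategy is given below, assuming a matrix F, squared Euclidean norms for both functionals, and bisection on the logarithm of the parameter as the adjustment rule. The function names, the bracketing interval and the synthetic data are hypothetical choices for illustration; the works cited above may adjust the parameter differently.

```python
import numpy as np

def tikhonov_solution(F, y, alpha):
    """Closed-form minimizer of ||F x - y||^2 + alpha ||x||^2."""
    n = F.shape[1]
    return np.linalg.solve(F.T @ F + alpha * np.eye(n), F.T @ y)

def morozov_via_tikhonov(F, y, delta, lo=1e-12, hi=1e6, iters=100):
    """Bisect on log(alpha) until ||F x_alpha - y||^2 matches delta.

    Relies on the residual being nondecreasing in alpha.
    """
    def residual(alpha):
        x = tikhonov_solution(F, y, alpha)
        return np.sum((F @ x - y) ** 2)

    if not residual(lo) <= delta <= residual(hi):
        raise ValueError("delta is outside the attainable residual range")
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint of the bracket
        if residual(mid) < delta:
            lo = mid
        else:
            hi = mid
    alpha = np.sqrt(lo * hi)
    return alpha, tikhonov_solution(F, y, alpha)

# Synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
F = rng.standard_normal((20, 10))
x_true = rng.standard_normal(10)
noise = 0.1 * rng.standard_normal(20)
y = F @ x_true + noise
delta = np.sum(noise ** 2)

alpha, x_delta = morozov_via_tikhonov(F, y, delta)
print(alpha, np.sum((F @ x_delta - y) ** 2), delta)
```

Each bisection step costs one linear solve, which is the sense in which the Morozov problem is reduced to a series of Tikhonov problems; an Ivanov constraint could be matched the same way by bisecting on the value of R at the Tikhonov solution.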

Example 2.8 (Sparse regularization). We consider regularization of a linear
operator equation Ku = g with an operator $K : \ell^2 \to Y$ with a Hilbert space Y by
means of a sparsity constraint [7, 18, 24]. In this setting one works with the
discrepancy functional $p(Ku,g) = \tfrac{1}{2}\|Ku - g\|_Y^2$ and the regularization
functional $R(u) = \|u\|_1$ (extended by $\infty$ if the 1-norm does not exist). In this
case:

• Tikhonov regularization consists of solving a convex, non-smooth, and 
unconstrained optimization problem (it is a non-smooth convex program, 
however, with additional structure), 

• Morozov regularization consists of solving a non-smooth and convex opti- 
mization problem with a (smooth) convex constraint (and it can be cast 
as a second-order cone-program), and 

• Ivanov regularization requires solving a smooth and convex optimization 
problem with a non-smooth convex constraint (it is a quadratic program) . 
Looking a little closer at this classification and the properties of p and
R we observe that Ivanov regularization gives in fact the "easiest" problem
since it has a smooth objective function and a constraint with a fairly easy
structure (e.g. it is easy to calculate projections onto the constraint). On the
other hand, the Morozov problem is "difficult" since it involves a non-smooth
objective over a fairly complicated convex set (in the sense that projections onto
the set $\{u : \|Ku - g\| \le \delta\}$ are hard to calculate). Indeed, this rationale is
behind the SPGL1 method [3"3"Il3"l] : It replaces the Morozov problem with a sequence of
Ivanov problems, solving each by a spectral projected gradient method, resulting
in one of the fastest methods available for Morozov regularization.

It has been observed in different contexts, that the Ivanov approach yields 
faster algorithms than the Tikhonov approach in this case, e.g. the basic pro- 
jected gradient method for Ivanov problems generally outperforms the basic 
iterative thresholding algorithm for Tikhonov problems [5]. 
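
The two basic algorithms mentioned here can be sketched in finite dimensions as follows. The code assumes a matrix K, the discrepancy $\tfrac{1}{2}\|Ku - g\|^2$, a fixed step size 1/‖K‖², and a sort-based projection onto the ℓ¹-ball; all function names and the synthetic data are hypothetical. It shows the structure of the iterations only and makes no attempt to reproduce the speed comparisons reported in the literature.

```python
import numpy as np

def project_l1_ball(v, tau):
    """Euclidean projection of v onto {u : ||u||_1 <= tau}, tau > 0 (sort-based)."""
    if np.abs(v).sum() <= tau:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    k = ks[u - (css - tau) / ks > 0][-1]   # number of entries that stay nonzero
    theta = (css[k - 1] - tau) / k         # shrinkage that makes the l1 norm equal tau
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ivanov_projected_gradient(K, g, tau, iters=500):
    """Minimize 0.5 ||K u - g||^2 subject to ||u||_1 <= tau."""
    step = 1.0 / np.linalg.norm(K, 2) ** 2
    u = np.zeros(K.shape[1])
    for _ in range(iters):
        u = project_l1_ball(u - step * K.T @ (K @ u - g), tau)
    return u

def tikhonov_ista(K, g, alpha, iters=500):
    """Minimize 0.5 ||K u - g||^2 + alpha ||u||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(K, 2) ** 2
    u = np.zeros(K.shape[1])
    for _ in range(iters):
        u = soft_threshold(u - step * K.T @ (K @ u - g), step * alpha)
    return u

# Synthetic sparse test problem (hypothetical data).
rng = np.random.default_rng(1)
K = rng.standard_normal((30, 60))
u_true = np.zeros(60)
u_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
g = K @ u_true

tau = np.abs(u_true).sum()
u_ivanov = ivanov_projected_gradient(K, g, tau)
u_tikhonov = tikhonov_ista(K, g, alpha=0.1)
print(np.abs(u_ivanov).sum(), np.count_nonzero(u_tikhonov))
```

Both iterations share the same gradient step and differ only in the final operation: a projection onto the constraint set for the Ivanov problem, a soft thresholding for the Tikhonov problem.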

In conclusion, the choice between the three variational methods should be 
based on the available prior knowledge and also on the tractability and the 
complexity of the corresponding optimization problem (often leading to a com- 
bination of two methods). 

3 Necessary conditions for Tikhonov schemes 

In this section we analyze regularization properties of the Tikhonov method. 
First we formalize our requirements for a scheme to be regularizing in the 
Tikhonov case. As usual we formulate conditions on existence, stability and 
convergence of the minimizers, cf. [29].

Definition 3.1 (Tikhonov regularization scheme). A variational scheme
$\mathcal{M}$ is called Tikhonov regularization scheme, if the following conditions
are fulfilled:

(R1) Existence: For all $\alpha > 0$ and all $y \in Y$ it holds that
$\operatorname{argmin}_{x \in X} T_{\alpha,y}(x) \neq \emptyset$.

(R2) Stability: Let $\alpha > 0$ be fixed, $y_n \xrightarrow{S} y$ and
$x_n \in \operatorname{argmin}_{x \in X} T_{\alpha,y_n}(x)$. Then $(x_n)$ converges
subsequentially and for each subsequential limit $\bar{x}$ of $(x_n)$ it holds that
$\bar{x} \in \operatorname{argmin}_{x \in X} T_{\alpha,y}(x)$.

(R3) Convergence: Let Fx = y have an exact solution such that $R(x) < \infty$
and $y_n \xrightarrow{S} y$. Then there exists a sequence $(\alpha_n)_n$ of positive real
numbers such that $x_n \in \operatorname{argmin}_{x \in X} T_{\alpha_n,y_n}(x)$ converges
subsequentially and every subsequential limit $\bar{x}$ is a p-generalized R-minimal
solution of Fx = y.

3.1 Trivial necessary conditions 

First we list fairly obvious necessary conditions to be regularizing in the Tikhonov 
sense. To that end, we introduce the solution operator 

\[
  A : Y \times ]0, \infty[ \to 2^X, \qquad
  (y, \alpha) \mapsto \operatorname*{argmin}_{x \in X} T_{\alpha,y}(x)
\]

for the Tikhonov problem ($T_\alpha$). For fixed $\alpha > 0$ we denote
$A_\alpha(y) = A(y, \alpha)$. We consider A and $A_\alpha$ as set valued mappings and use
the respective notation (see, e.g., [3D]), especially the notion of the domain
$\operatorname{dom} A_\alpha = \{y \in Y : A_\alpha(y) \neq \emptyset\}$
and the graph $\operatorname{gr}(A_\alpha) = \{(y,x) \in Y \times X : x \in A_\alpha(y)\}$.

Moreover, we recall that a topology is called sequential if it can be described 
by sequences, i.e. every sequentially closed set is closed. 

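Remark 3.2. Let $\mathcal{M}$ be a variational scheme. Then obviously (R1) is fulfilled
if and only if $\operatorname{dom} A = Y \times ]0, \infty[$. This is the case if and only if
for every y it holds that
$\operatorname{dom} R \cap F^{-1}(\operatorname{dom} p(\cdot, y)) \neq \emptyset$, in particular
$\operatorname{range} F \cap \operatorname{dom} p(\cdot, y) \neq \emptyset$.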

Theorem 3.3. Let $\mathcal{M}$ be a variational scheme that fulfills (R2), $\alpha > 0$ and
$y \in Y$. Then

1. $A_\alpha(y)$ is sequentially compact and so is
$\{x_n : x_n \in A_\alpha(y_n)\} \cup A_\alpha(y)$ for every sequence $(y_n)$ in Y such
that $y_n \xrightarrow{S} y$.

2. The implication

\[
  y_n \xrightarrow{S} y, \quad x_n \xrightarrow{\tau_X} x, \quad x_n \in A_\alpha(y_n)
  \quad \Longrightarrow \quad x \in A_\alpha(y)
\]

does hold, i.e. the mapping $A_\alpha$ is sequentially closed.

If S is induced by a topology $\tau$ and $\tau_X \times \tau$ is sequential, then
$\operatorname{gr}(A_\alpha)$ is closed for every $\alpha > 0$.

If $A_\alpha$ is single valued, then (R2) does hold if and only if $A_\alpha$ is continuous
w.r.t. S and the sequential convergence structure of $\tau_X$.

Proof. Let $(x_n)$ be a sequence in $A_\alpha(y)$ and consider the constant sequence
$y_n := y$. Then $y_n \xrightarrow{S} y$ and $x_n \in A_\alpha(y_n)$ do hold. Therefore (R2)
implies the existence of a convergent subsequence of $(x_n)$ converging to an
element of $A_\alpha(y)$. □

3.2 A closer look on the data space

There exists a vast amount of settings that provide sufficient conditions for a
Tikhonov scheme with non-metric discrepancy term to be regularizing. Here we
start from a theorem which is extracted from [11, 12, 27].

Theorem 3.4. Let $\mathcal{M} = ((X, \tau_X), (Y, \tau_Y, S), p, R, F)$ be a variational
scheme that fulfills the following list of assumptions:

(A1) The sublevel sets $\{x \in X : R(x) \le M\}$ are sequentially compact w.r.t.
$\tau_X$ for all $M > 0$ and R is sequentially lower semicontinuous.

(A2) $\operatorname{dom} T_{\alpha,y} \neq \emptyset$ for all $y \in Y$.

(A3) $(x,y) \mapsto p(Fx,y)$ is sequentially $\tau_X \times \tau_Y$ lower semicontinuous.

(A4) The sequential convergence structure S is given by

\[ y_n \xrightarrow{S} y \quad \text{if and only if} \quad p(y, y_n) \to 0 \tag{CONV} \]

and furthermore it fulfills

\[ y_n \xrightarrow{S} y \quad \text{implies} \quad p(z, y_n) \to p(z, y) \ \text{ for all } z \in \operatorname{dom} p(\cdot, y). \tag{CONT} \]

(A5) $y_n \xrightarrow{S} y$ implies $y_n \xrightarrow{\tau_Y} y$.

Then $\mathcal{M}$ is a Tikhonov regularization scheme.
Proof. We will only give a sketch of the proof, for details we refer to [T2"l[lT?ll2"T] . 



Since p and R are nonnegative, (A1)-(A3) imply (R1) (existence of minimizers).

Let $(x_n)$ be a sequence of minimizers as in (R2). Then $(R(x_n))$ is bounded
due to (A2) and [CONT]. Hence, (A1) delivers a convergent subsequence. Let
$\bar{x}$ be the limit of such a subsequence. Then (A5), (A3) and [CONT] yield
$T_{\alpha,y}(\bar{x}) \le T_{\alpha,y}(x)$ for all $x \in X$. Consequently, (R2) is fulfilled
(stability).

Let $F x^\dagger = y$, $R(x^\dagger) < \infty$ and $(y_n)$ be a sequence such that
$y_n \xrightarrow{S} y$. Then, due to [CONV], there exists $\alpha_n$ such that

\[ \alpha_n \to 0 \quad \text{and} \quad \frac{p(y, y_n)}{\alpha_n} \to 0 \quad \text{as } n \to \infty \tag{1} \]

does hold (e.g. $\alpha_n = \sqrt{p(y, y_n)}$).

Therefore $R(x_n) \le \frac{1}{\alpha_n} T_{\alpha_n,y_n}(x^\dagger)$ for
$x_n \in \operatorname{argmin}_{x \in X} T_{\alpha_n,y_n}(x)$, and together with (A1) this
yields subsequential convergence and $R(\bar{x}) \le R(x^\dagger)$ for every
subsequential limit $\bar{x}$. Using [CONV] we get $p(F(x_n), y_n) \to 0$, which yields
$p(F\bar{x}, y) = 0$ due to (A5) and (A3). □

Remark 3.5. In [57] it is additionally assumed that $p(z,y) = 0$ implies y = z.
This allows one to formulate (R3) with R-minimal solutions in the strict sense
(i.e. with Fx = y) instead of p-generalized R-minimal solutions.

In item (A4) it would be sufficient if [CONT] only holds for
$z \in \operatorname{dom} p(\cdot, y) \cap F(X)$.

As remarked earlier, it is hard to obtain necessary conditions for a general 
Tikhonov scheme to be regularizing. Hence, we have chosen to start with the 
analysis of the data space Y. This is motivated by the fact that there are three 
different objects that pose additional structure on Y, namely the topology
$\tau_Y$, the sequential convergence structure S and the discrepancy functional p.
Obviously, not every choice of these three objects will lead to a regularization
scheme. We start from Theorem 3.4 and the conditions [CONV], [CONT] and (A5) and
investigate the interplay of $\tau_Y$, S and p and deduce necessary conditions on
their relations. We are aware that the conditions [CONV], [CONT] and (A5)
are not necessary for a scheme to be regularizing, but they appear as natural
conditions in the context of regularization. However, we will get two different
topologies that, under appropriate circumstances, provide exactly the desired
convergent sequences, both given in a constructive way. Applied to specific
classes of discrepancy functionals this could allow a deeper structural insight
into what [CONT] really means and may identify a subclass for which Theorem
3.4 is applicable without further adaptations.

Now we define the two topologies mentioned above, the first one designed to 
satisfy [CONV], the second to satisfy [CONT]. 

Definition 3.6. Let Y be a set and $p : Y \times Y \to [0, \infty]$ such that
$p(y, y) = 0$ for all $y \in Y$.

1. We call

\[ B_\varepsilon^p(z) := \{ y \in Y : p(z,y) < \varepsilon \} \]

the $\varepsilon$-ball w.r.t. p centered at z and set

\[ \tau_p := \{ U \subset Y : \forall z \in U \ \exists \varepsilon > 0 \text{ such that } B_\varepsilon^p(z) \subset U \}. \]

2. Let $Z, \tilde{Y} \subset Y$ and $[0, \infty]$ be equipped with the one-point
compactification of the standard topology on $[0, \infty[$. For $z \in Z$ we define

\[ f_z : \tilde{Y} \to [0, \infty] \quad \text{by} \quad y \mapsto p(z, y). \]

By $\tau_{IN}$ we denote the initial topology on $\tilde{Y}$ w.r.t. the family
$(f_z)_{z \in Z}$, i.e. the coarsest topology on $\tilde{Y}$ for which all the $f_z$ are
continuous.

Note that the notation $\tau_{IN}$ does not reflect the dependency on $\tilde{Y}$ and Z.
Hence, throughout the paper we will always mention explicitly the involved
$\tilde{Y}$ and Z.
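
On a finite set the topology generated by the balls of Definition 3.6 can be enumerated directly, which gives a feeling for the effect of a non-symmetric discrepancy. The sketch below is an illustration under assumptions: the three-point set, the table of discrepancy values and the function name are hypothetical. It uses that on a finite set the balls around z shrink to the set of points with zero discrepancy from z, so a set is open exactly if it contains that set for each of its points.

```python
from itertools import combinations

def tau_p_open_sets(points, p):
    """All open subsets of a finite set w.r.t. the ball topology of p.

    On a finite set, the balls {y : p(z, y) < eps} shrink to {y : p(z, y) = 0}
    for small eps, so U is open iff it contains that set for each z in U.
    """
    zero_ball = {z: {y for y in points if p[z][y] == 0} for z in points}
    open_sets = []
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            U = set(subset)
            if all(zero_ball[z] <= U for z in U):
                open_sets.append(frozenset(U))
    return open_sets

# Hypothetical non-symmetric discrepancy on {0, 1, 2}: p(0, 1) = 0 but p(1, 0) = 1.
points = [0, 1, 2]
p = {0: {0: 0.0, 1: 0.0, 2: 1.0},
     1: {0: 1.0, 1: 0.0, 2: 1.0},
     2: {0: 1.0, 1: 1.0, 2: 0.0}}
opens = tau_p_open_sets(points, p)
print(sorted(sorted(U) for U in opens))
```

In this example every open set containing 0 also contains 1, so the constant sequence with value 1 converges to 0 as well as to 1; limits w.r.t. this topology need not be unique.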

Remark 3.7. The two additional sets Z and $\tilde{Y}$ are introduced to allow to model
a broader class of discrepancy functionals and to construct a larger variety of
topologies. First, note that there are non-symmetric discrepancy functionals
and even ones in which the domains of $p(\cdot, y)$ and $p(z, \cdot)$ differ. In
particular, both arguments of p have different meanings: The first argument takes
images of solutions x under F which can have additional structure (e.g. due to
discretization), while the second argument takes measured data which may also
have additional characteristics. Moreover, a smaller Z will allow for a coarser
topology (and this will be helpful if the range of F is a "small" set) and a
smaller $\tilde{Y}$ can model a restricted set of possible data (e.g. strictly
nonnegative data).

If $Z = \tilde{Y} = Y$ then $\tau_{IN}$ does satisfy [CONT]. Note that continuity of all
the $f_z$ on the whole of Y is stronger than required in [CONT]. There we only
need continuity of $f_z$ at y for all $z \in \operatorname{dom} p(\cdot, y)$.

For the reader's convenience we recall some properties of the topologies $\tau_p$
and $\tau_{IN}$ that will be used in the further course of the paper.

Lemma 3.8. The following properties hold for $\tau_p$:

1. $\tau_p$ is a sequential topology.

2. A mapping from Y to an arbitrary topological space is $\tau_p$-continuous if
and only if it is sequentially continuous w.r.t. $\tau_p$.

3. $p(y, y_n) \to 0$ implies $y_n \xrightarrow{\tau_p} y$.

The following holds for $\tau_{IN}$:

4. For arbitrary $Z, \tilde{Y} \subset Y$ sequential convergence w.r.t. $\tau_{IN}$ can be
characterized as follows:

Let $(y_n)_{n \in \mathbb{N}}$ be a sequence in $\tilde{Y}$ and $y \in \tilde{Y}$. Then
$y_n \xrightarrow{\tau_{IN}} y$ if and only if $p(z, y_n) \to p(z, y)$ for all $z \in Z$.

5. If additionally $\tilde{Y} \subset Z$ does hold, $y_n \xrightarrow{\tau_{IN}} y$ implies
$p(y, y_n) \to 0$.
Proof. For 1. see [TJ §2.4]. Item 2. is a direct consequence of $\tau_p$ being
sequential and 3. is clear from the definition of open sets w.r.t. $\tau_p$. Then,
the first implication of 4. is due to the sequential continuity of continuous
maps; the converse holds because the set
$\{ f_z^{-1}(V) : z \in Z,\ V \subset [0, \infty] \text{ open} \}$ is a subbase for $\tau_{IN}$.
Finally, 5. is the continuity of $f_y$ at y. □

Now we investigate the relation of $\tau_p$ to the property [CONV].

Theorem 3.9. Let $\tau$ be a topology on Y. Then the following does hold:

1. The property

\[ p(y, y_n) \to 0 \quad \text{implies} \quad y_n \xrightarrow{\tau} y \tag{2} \]

does hold if and only if $\tau$ is coarser than $\tau_p$.

2. If $\tau$ has property [CONV], then so does $\tau_p$. In particular $\tau_p$ is the
finest topology with that property.

Proof. 1. Let $\tau$ be coarser than $\tau_p$, then every $\tau_p$-convergent sequence is
also $\tau$-convergent, and therefore (2) does hold by item 3 of Lemma 3.8.

Now let $\tau$ be a topology where (2) does hold. Suppose there exists $U \in \tau$
and $U \notin \tau_p$. Then there is a $u \in U$ such that for all $n \in \mathbb{N}$ there
exists a $y_n \in B_{1/n}^p(u) \setminus U$. Evidently $p(u, y_n) \to 0$ does hold and
therefore $y_n \xrightarrow{\tau} u$, in contradiction to $y_n \notin U$.

2. Let $\tau$ be a topology that fulfills [CONV] and $(y_n)$ a $\tau_p$-convergent
sequence with limit y. Due to 1., $\tau$ is coarser than $\tau_p$, therefore
$y_n \xrightarrow{\tau} y$ and consequently $p(y, y_n) \to 0$.

□ 

So, if S is induced by a topology at all, this is also done by the relatively
well-behaved (i.e. sequential) topology $\tau_p$. This is, e.g., the case if S provides
unique limits since a sequence S-converges given all its subsequences have a
subsequence tending to the same limit (see e.g. [3, Prop. 1.7.15], [23]).

Corollary 3.10. If there is a topology $\tau$ where $p(y, y_n) \to 0$ implies
$y_n \xrightarrow{\tau} y$ such that [CONT] is fulfilled, then $\tau_p$ also fulfills [CONT].

Since we are only interested in sequential convergence, this allows us to take
$\tau_p$ as a sort of model topology.

Remark 3.11. In general, the set $\tau_S$ of all sequentially open sets w.r.t. a
sequential convergence structure S on Y is a topology on Y. As has been shown
in [TJl Prop. 2.10], in the case that S is given by [CONV], it is sufficient for
[CONV] to hold for the topology $\tau_S$ as well, that S fulfills [CONT].

Therefore assumption (A4) implies that $\tau_p$ also has [CONV] and this again
implies that $\tau_S = \tau_p$, since $\tau_p$ is sequential. Moreover, in this case the
sets $B_\varepsilon^p(y)$ are open for all $\varepsilon > 0$, $y \in Y$ (see also [12]) and
therefore constitute a base for $\tau_p$.

The next theorem deals with the consequences of [CONT] holding
in $\tau_\rho$.

Theorem 3.12. Let $Z \subset \bigcap_{y \in Y} \operatorname{dom} \rho(\cdot, y)$ be nonempty and $\tilde{Y} = Y$.
If $\tau_\rho$ fulfills [CONT], then the following holds:

1. $\tau_{\mathrm{IN}}$ is coarser than $\tau_\rho$.

2. If $Z = Y$, then $\tau_\rho$ and $\tau_{\mathrm{IN}}$ both satisfy [CONV]. In particular, they have
the same convergent sequences.

Proof. 

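1. Since $\rho(z, \cdot)$ is sequentially continuous for all $z \in Z$ and $\tau_\rho$ is sequential,
it is also continuous, and therefore $\tau_{\mathrm{IN}}$ is coarser than $\tau_\rho$.

2. Due to 1., convergence w.r.t. $\tau_\rho$ yields convergence w.r.t. $\tau_{\mathrm{IN}}$ and hence
$\rho(y, y_n) \to 0$ implies $y_n \xrightarrow{\tau_{\mathrm{IN}}} y$. Since $Y \subset Z$ the converse is also true and
therefore $\tau_{\mathrm{IN}}$ satisfies [CONV], and so does $\tau_\rho$.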

□ 

Remark 3.13. If $\tilde{Y} \subset Y$ and $(\tau_\rho)|_{\tilde{Y}}$ is sequential (e.g. if $\tilde{Y}$ is open or closed w.r.t.
$\tau_\rho$, see [12]), then $(\tau_\rho)|_{\tilde{Y}} = \tau_{\rho|_{\tilde{Y}}}$ holds. In a setting where $\tilde{Y} \subset Z \subset Y$, this
together with Theorem 3.12 would still guarantee that $\tau_{\mathrm{IN}}$ and the subspace
topology of $\tau_\rho$ on $\tilde{Y}$ provide the same convergent sequences.

If $\rho(z, \cdot)$ is $\tau_\rho$-continuous at every $y \in Y$ for all $z \in Z$ regardless of
the finiteness condition in [CONT], then we can drop the assumption
$Z \subset \bigcap_{y \in Y} \operatorname{dom} \rho(\cdot, y)$ in Theorem 3.12.

So, in the setting of Theorem 3.12, sequential convergence in $\tau_\rho$ and $\tau_{\mathrm{IN}}$
coincides. In general, the sequential convergence structures of these topologies
can be different from each other.

3.3 Application to Bregman discrepancies 

We conclude Section 3 with an application to a special class of discrepancy
functionals, namely ones that stem from Bregman distances which appear, e.g., in
the case of Poisson noise or multiplicative noise [5]. In particular, this gives an
example that illustrates how Theorem 3.12 can be used to gain necessary
conditions on the discrepancy functional for Theorem 3.4 to apply. We also treat
the question of when $\rho(y_1, y_2) = 0$ implies $y_1 = y_2$ in this case.

In the following let $V$ be a Banach space and $J : V \to [0, \infty]$ proper, convex,
$Z = Y = \operatorname{dom} J$ and $\tilde{Y} \subset \{y \in Y : J \text{ Gâteaux differentiable at } y\}$. By $\nabla J$ we
denote the Gâteaux derivative of $J$. As distance functional $\rho$ we consider the
Bregman distance w.r.t. $J$. Consequently, for $(z, y) \in Y \times \tilde{Y}$ it takes the form

$\rho(z, y) = J(z) - J(y) - \langle \nabla J(y), z - y \rangle$.

The following lemma explores what convergence w.r.t. $\tau_{\mathrm{IN}}$ actually looks like
and when $\rho(y_1, y_2) = 0$ holds.

Lemma 3.14. For all sequences $(y_n)$ in $\tilde{Y}$, $y \in \tilde{Y}$ the following holds:

1. $y_n \xrightarrow{\tau_{\mathrm{IN}}} y$ if and only if $\rho(y, y_n) \to 0$ and $\langle \nabla J(y_n) - \nabla J(y), y - z \rangle \to 0$ for
all $z \in Z$.

2. If $Y$ is a linear subspace of $V$, then $y_n \xrightarrow{\tau_{\mathrm{IN}}} y$ if and only if $\rho(y, y_n) \to 0$ and
$\nabla J(y_n) \overset{*}{\rightharpoonup} \nabla J(y)$ in $Y^*$. In particular, $(\nabla J)|_{\tilde{Y}} : \tilde{Y} \to Y^*$ is sequentially
$\tau_{\mathrm{IN}}$-weak* continuous.

3. Let $y_1, y_2 \in \tilde{Y}$. Then $\rho(y_1, y_2) = 0$ if and only if $\nabla J(y_1) = \nabla J(y_2)$.

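Proof. 1. The identity $\rho(z, y_n) - \rho(z, y) = \rho(y, y_n) + \langle \nabla J(y_n) - \nabla J(y), y - z \rangle$
holds for all $z \in Z$.

So clearly $\rho(y, y_n) \to 0$ and $\langle \nabla J(y_n) - \nabla J(y), y - z \rangle \to 0$ imply $y_n \xrightarrow{\tau_{\mathrm{IN}}} y$.

Conversely, let $y_n \xrightarrow{\tau_{\mathrm{IN}}} y$ hold. Choosing $z = y \in Z$ gives $\rho(y, y_n) \to \rho(y, y) = 0$,
and hence $0 = \lim_{n \to \infty} (\rho(z, y_n) - \rho(z, y) - \rho(y, y_n)) = \lim_{n \to \infty} \langle \nabla J(y_n) - \nabla J(y), y - z \rangle$.

2. is a direct consequence of 1.

3. First let $\rho(y_1, y_2) = 0$. Then $J(y_1) = J(y_2) + \langle \nabla J(y_2), y_1 - y_2 \rangle$ and
hence linearity of $\nabla J(y_2)$ and nonnegativity of $\rho$ imply
$J(v) - J(y_1) - \langle \nabla J(y_2), v - y_1 \rangle = \rho(v, y_2) \geq 0$ for all $v \in V$. Therefore $\nabla J(y_2)$ is a
subgradient of $J$ in $y_1$. Since $J$ is differentiable at $y_1$ this yields $\nabla J(y_2) =
\nabla J(y_1)$.

Now let $\nabla J(y_2) = \nabla J(y_1)$. Then $0 \geq -\rho(y_1, y_2) = \rho(y_2, y_1) \geq 0$.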

□ 
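The identity used in part 1 of the proof can be checked numerically. The following is a minimal sketch, assuming the example functional $J(y) = \sum_i (y_i \log y_i - y_i + 1)$ on vectors with positive entries (a Kullback-Leibler type choice that is not taken from the text); the helper names `J`, `grad_J`, `dual`, `bregman` and `identity_gap` are hypothetical. Since this $\operatorname{dom} J$ is not a linear subspace, the sketch only illustrates part 1 of Lemma 3.14.

```python
import math

def J(y):
    # Assumed example functional (not from the text):
    # J(y) = sum(y_i*log(y_i) - y_i + 1), defined for y_i > 0.
    return sum(t * math.log(t) - t + 1.0 for t in y)

def grad_J(y):
    # Gradient of the assumed J: component-wise logarithm.
    return [math.log(t) for t in y]

def dual(g, v):
    # Dual pairing <g, v> in finite dimensions.
    return sum(a * b for a, b in zip(g, v))

def bregman(z, y):
    # rho(z, y) = J(z) - J(y) - <grad J(y), z - y>
    diff = [a - b for a, b in zip(z, y)]
    return J(z) - J(y) - dual(grad_J(y), diff)

def identity_gap(z, y, y_n):
    # Left-hand side minus right-hand side of
    # rho(z, y_n) - rho(z, y) = rho(y, y_n) + <grad J(y_n) - grad J(y), y - z>
    lhs = bregman(z, y_n) - bregman(z, y)
    g = [a - b for a, b in zip(grad_J(y_n), grad_J(y))]
    rhs = bregman(y, y_n) + dual(g, [a - b for a, b in zip(y, z)])
    return lhs - rhs

z = [0.5, 2.0, 1.5]
y = [1.0, 1.0, 3.0]
for n in (1, 10, 100, 1000):
    # Perturbed points y_n that approach y as n grows.
    y_n = [t * (1.0 + 1.0 / n) for t in y]
    print(n, bregman(y, y_n), identity_gap(z, y, y_n))
```

Up to floating-point rounding, the printed gap should vanish for every $n$, while $\rho(y, y_n)$ shrinks as $y_n$ approaches $y$.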

Corollary 3.15. Let $\operatorname{dom} J = V$ and $J$ be differentiable on $V$ and set $\tilde{Y} = V$.

1. If $\tau_\rho$ satisfies [CONT], then $\nabla J$ is $\tau_\rho$-weak* continuous.

2. The property

$\rho(y_1, y_2) = 0 \Rightarrow y_1 = y_2$ for all $y_1, y_2 \in Y$

holds if and only if $\nabla J$ is injective.

3. $\tau_{\mathrm{IN}}$ provides unique sequential limits if and only if $\nabla J$ is injective.

So, in the setting of Corollary 3.15, it is necessary for Theorem 3.4 to apply
to Bregman discrepancies that $J$ has a $\tau_\rho$-weak* continuous derivative, and a
Tikhonov regularization scheme with discrepancy $\rho$ guarantees convergence to
an exact solution if $J$ has an injective derivative.
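The role of injectivity in part 2 of Corollary 3.15 can be seen in a one-dimensional sketch. The two functionals below are illustrative assumptions, not examples from the text: `J_flat` is convex and differentiable on the real line but has zero derivative on $[-1, 1]$, while `J_quad` has the identity as derivative. All function names are hypothetical.

```python
def J_flat(y):
    # Assumed example: J(y) = max(|y| - 1, 0)^2, convex and differentiable,
    # with derivative equal to zero on the whole interval [-1, 1].
    e = max(abs(y) - 1.0, 0.0)
    return e * e

def dJ_flat(y):
    # Derivative of J_flat; not injective because it vanishes on [-1, 1].
    e = max(abs(y) - 1.0, 0.0)
    return 2.0 * e * (1.0 if y > 0 else -1.0)

def J_quad(y):
    # Assumed example: J(y) = y^2 / 2 with injective derivative.
    return 0.5 * y * y

def dJ_quad(y):
    # Derivative of J_quad is the identity map.
    return y

def bregman(J, dJ, z, y):
    # rho(z, y) = J(z) - J(y) - J'(y) * (z - y) on the real line.
    return J(z) - J(y) - dJ(y) * (z - y)

y1, y2 = 0.5, -0.3
# Two different points with equal derivative under J_flat.
print(bregman(J_flat, dJ_flat, y1, y2))
# The same two points under J_quad, whose derivative separates them.
print(bregman(J_quad, dJ_quad, y1, y2))
```

For `J_flat` the discrepancy between the two distinct points is zero, so $\rho$ cannot separate them. For `J_quad` the Bregman distance reduces to $\frac{1}{2}(y_1 - y_2)^2$, which is positive whenever $y_1 \neq y_2$.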

4 Conclusion 

We examined variational regularization in a quite general setting and started 
a study on necessary conditions for variational schemes to be regularizing.
Although it seems that little can be said about necessary conditions in general,
we obtained several results in this direction. Especially, we tried to clarify the
relations between the different players in the data space, e.g. the convergence 
structure, the topology and the discrepancy functional. Here we started from 
a list of conditions which is known to guarantee regularizing properties and de- 
duced necessary conditions for the topologies and the discrepancy functional. 
For Bregman discrepancies we illustrated that our results imply necessary con- 
ditions for the continuity of the derivative of the functional which induces the 
Bregman distance. 

Although our results are fairly abstract, they are first steps towards the anal- 
ysis of necessary conditions which can be used to figure out essential limitations 
of variational schemes. Next steps could be to analyze the other ingredients of a
variational scheme, namely the solution space X, its topology, the regularization
functional and, of course, the operator. Other directions for future research are
to consider special classes of discrepancy functionals with additional structure
and to extend the analysis to Morozov and Ivanov regularization.